Competition and fragmentation: a simple model generating lognormal-like distributions
Abstract
The current distribution of language sizes, measured by speaker population, is generally described by a lognormal distribution. Analyzing the original data, we show that the double-Pareto lognormal distribution gives an alternative fit that indicates the existence of a power-law tail. A simple model based on competition and fragmentation reproduces this behavior and approximates the real data well.

The striking similarity between biological and language evolution has attracted researchers accustomed to analyzing genetic properties of biological populations, who have applied their methods to problems in linguistics [1, 2, 3]. Their techniques have succeeded, for instance, in explaining some interesting features related to the coexistence of the roughly 7,000 languages present on Earth. More recently, the effort to bring evolutionary biology into contact with linguistics has led to studies of how competition between languages affects language evolution. The work of Abrams and Strogatz in 2003 [4], who analyzed the stability of a system composed of two competing languages, can be considered the starting point of this research line. In the following years, other groups independently developed new analytical and computational models [5, 6, 7, 8, 9, 10, 11, 12]. An overview of the fast-growing literature on language competition can be found in Refs. [13, 14].

Languages are by no means static. They continuously evolve, changing, for example, their lexicon, phonetics, and grammatical structure. This evolution is similar to the evolution of species driven by mutations and natural selection [15]. Following the common picture of biology, changes in the language structure may be seen as the result of microscopic stochastic changes caused by mutations.
Natural selection, which may be caused by competition among individuals, favors some of these small changes according to their reproductive success. Such a microscopic picture corresponds to a sequence of macroscopic observations. In language evolution, these macroscopic events are, for instance, the origination of two languages from an ancestral one (for example, the emergence of the Romance languages from Latin) or the extinction of a language.

In this work we model the evolution of languages from a macroscopic point of view. More precisely, the microscopic processes responsible for the differentiation of one language into two new languages are not implemented here. In effect, we neglect the microdynamics that generates language changes at the level of individuals and describe only their effect on extinction and differentiation at the level of languages, through a phenomenological mechanism of growth and fragmentation. Language change is determined by the dynamics of the size of its speaker population. The mechanism taken as the origin of these population size changes is that rare languages are less attractive for people to learn and use. Consequently, this mechanism introduces a kind of frequency-dependent reproductive success for different languages. Statistical data supporting this thesis can be found in Ref. [16]: being documented as declining is negatively correlated with a language's population size. This phenomenon is similar to the Allee effect in biology [17].

With a simple computational model based on the mechanisms described above, we compare simulation results with empirical data on the distribution of population sizes of the languages (DPL) currently spoken on Earth [18] (see Fig. 1). Several attempts have recently been made to reproduce the DPL. Two works focused on the apparently lognormal shape of the DPL.
Tuncay [19] described language differentiation by means of a process of successive fragmentations, in combination with a multiplicative growth process. In a recent paper by Zanette [20], the dynamics of language evolution is considered a direct consequence of the demographic increase of the speaker populations, which is modeled by a simple multiplicative process. Unsurprisingly, these models obtain pure lognormal distributions for the DPL, as expected from the central limit theorem applied to products of random variables [21, 22]. However, the DPL is known to differ significantly from a pure lognormal shape [16].

The Schulze model [8] relies on ideas already successfully applied to model biological evolution. Languages are identified by a bit string that represents their characteristic features. New languages are produced by mutations of these features, and small languages are discriminated against by competition. During the transient towards the stationary state, these simulations are able to generate data with a distribution similar to the DPL. A review of the Schulze model and its application to different problems connected to language interaction can be found in Ref. [14].

Another model, the Viviane model, simulates human settlement of an unoccupied region. Languages undergo local mutations until the available space becomes completely populated [12]. The combination of the Viviane model with the bit-string approach led to new results that reproduced the DPL well over almost the entire range, except for large language sizes [23].

A closer look at the DPL suggests that the deviations from the lognormal shape could be due to power-law decays. We investigate this idea by fitting the DPL with a double-Pareto lognormal distribution and by comparing it with the simulation results of our new model.

The paper is organized as follows.
First we analyze the DPL and show that the double-Pareto lognormal distribution [24, 25] gives an alternative fit. The next section introduces the computational model. The last two sections present the simulation results and the conclusions, respectively.

1. Statistical analysis of the distribution of languages

Reference [18] provides the number of people speaking a given language as their mother tongue. Starting from these data, we build a histogram by counting the number of languages whose population size falls in a bin between $\log(2^{n})$ and $\log(2^{n+1})$, where n = 0, 1, 2, .... This distribution defines the DPL. A pure lognormal shape can be described by

DPLlognormal(L) = A exp [
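Returning to the histogram construction described earlier in this section, the following is a minimal sketch of the power-of-two binning, not the authors' code. It assumes integer speaker counts and half-open bins [2^n, 2^(n+1)); the source does not say which bin edge is inclusive. The function name `log2_bin_counts` and the sample sizes are hypothetical.

```python
from collections import Counter


def log2_bin_counts(populations):
    """Count languages per bin n, where bin n holds sizes in [2**n, 2**(n+1)).

    Assumption: sizes are integer speaker counts; the half-open interval
    is a choice made here, not something stated in the source.
    """
    counts = Counter()
    for size in populations:
        size = int(size)
        if size < 1:
            continue  # no bin with n >= 0 contains sizes below 1
        # For a positive integer, bit_length() - 1 equals floor(log2(size)),
        # which avoids floating-point rounding at exact powers of two.
        n = size.bit_length() - 1
        counts[n] += 1
    return dict(sorted(counts.items()))


# Hypothetical speaker counts, for illustration only (not data from Ref. [18]).
sample_sizes = [1, 2, 3, 4, 7, 8, 1000, 1_000_000]
histogram = log2_bin_counts(sample_sizes)
print(histogram)
```

Plotting the counts against n (that is, against the logarithm of the population size) gives the kind of histogram to which a lognormal or double-Pareto lognormal curve could then be fitted.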
Publication date: 2009